Overview

Dataset statistics

Number of variables: 24
Number of observations: 365369
Missing cells: 105505
Missing cells (%): 1.2%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 69.7 MiB
Average record size in memory: 200.0 B
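
The percentage and per-record figures above follow from the raw counts. A minimal sketch, assuming the missing-cell percentage is taken over all cells (observations × variables) and that 1 MiB is 2**20 bytes; the variable names are illustrative only:

```python
# Counts copied from the dataset statistics above.
n_variables = 24
n_observations = 365369
missing_cells = 105505
total_size_mib = 69.7

# Assumption: the percentage is relative to every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: MiB means 2**20 bytes.
avg_record_bytes = total_size_mib * 2**20 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions both values round to the figures shown in the report.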

Variable types

Numeric: 15
Categorical: 9

Alerts

High correlation: zipcode is highly overall correlated with lng and 2 other fields
High correlation: target_clean is highly overall correlated with log_target_clean and 3 other fields
High correlation: log_target_clean is highly overall correlated with target_clean and 3 other fields
High correlation: sqft_clean is highly overall correlated with target_clean and 3 other fields
High correlation: log_sqft_clean is highly overall correlated with target_clean and 3 other fields
High correlation: baths_clean is highly overall correlated with target_clean and 3 other fields
High correlation: population is highly overall correlated with housing_units and 1 other field
High correlation: lat is highly overall correlated with state_zip and 1 other field
High correlation: lng is highly overall correlated with zipcode and 2 other fields
High correlation: median_household_income is highly overall correlated with mean_rating
High correlation: housing_units is highly overall correlated with population and 1 other field
High correlation: occupied_housing_units is highly overall correlated with population and 1 other field
High correlation: mean_rating is highly overall correlated with median_household_income
High correlation: state_zip is highly overall correlated with zipcode and 3 other fields
High correlation: major_city_ch is highly overall correlated with zipcode and 3 other fields
Imbalance: mlsid_join_bool is highly imbalanced (51.4%)
Imbalance: status_cl is highly imbalanced (68.4%)
Missing: baths_clean has 102041 (27.9%) missing values
Skewed: target_clean is highly skewed (γ1 = 25.64772097)
Skewed: sqft_clean is highly skewed (γ1 = 150.3844613)
Zeros: sqft_clean has 48935 (13.4%) zeros
Zeros: log_sqft_clean has 48935 (13.4%) zeros
Zeros: home_age has 29405 (8.0%) zeros
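
The imbalance percentages in the alerts can be reproduced from the class counts reported later for mlsid_join_bool and status_cl. The sketch below assumes the score is one minus the Shannon entropy of the class frequencies, normalised by its maximum log2(k) for k classes. The function name is illustrative, and this is a formula that matches the reported numbers rather than a statement about the tool's internals:

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy of the class shares."""
    if len(counts) < 2:
        raise ValueError("need at least two classes")
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1 - entropy / math.log2(len(counts))

# Class counts copied from the Common Values tables of the two variables.
mlsid_join_bool = [326793, 38576]
status_cl = [298185, 39990, 9411, 5352, 5217, 3710, 1308, 1066, 722, 408]

print(f"mlsid_join_bool: {imbalance_score(mlsid_join_bool):.1%}")
print(f"status_cl: {imbalance_score(status_cl):.1%}")
```

A perfectly balanced variable scores 0 under this definition, and a variable dominated by one class approaches 1.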

Reproduction

Analysis started: 2024-05-16 13:39:20.007802
Analysis finished: 2024-05-16 13:40:40.585670
Duration: 1 minute and 20.58 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
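
A report of this kind is typically generated from a pandas DataFrame. The sketch below shows one plausible way to regenerate it with ydata-profiling. The function name, file paths and title are hypothetical, since the report does not state how the data was loaded or which settings config.json contains:

```python
def build_report(csv_path, html_path, title="Profiling report"):
    """Profile the table stored at csv_path and write an HTML report to html_path."""
    # Imported lazily so the function can be defined without the libraries installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)  # assumption: the data is available as a CSV file
    report = ProfileReport(df, title=title)
    report.to_file(html_path)
    return report

# Example call (paths are placeholders):
# build_report("listings_clean.csv", "listings_profile.html")
```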

Variables

zipcode
Real number (ℝ)

HIGH CORRELATION 

Distinct: 4231
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 51624.456
Minimum: 1104
Maximum: 99338
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 5.6 MiB

Quantile statistics

Minimum: 1104
5-th percentile: 11235
Q1: 32836
Median: 37205
Q3: 77386
95-th percentile: 95620
Maximum: 99338
Range: 98234
Interquartile range (IQR): 44550

Descriptive statistics

Standard deviation: 26828.688
Coefficient of variation (CV): 0.5196895
Kurtosis: -1.3445354
Mean: 51624.456
Median Absolute Deviation (MAD): 17189
Skewness: 0.29467617
Sum: 1.8861976 × 10^10
Variance: 7.1977851 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
32137 2104
 
0.6%
33131 1550
 
0.4%
34747 1474
 
0.4%
78245 1361
 
0.4%
33137 1298
 
0.4%
33132 1295
 
0.4%
78253 1252
 
0.3%
34759 1240
 
0.3%
78254 1211
 
0.3%
33130 1161
 
0.3%
Other values (4221) 351423
96.2%
ValueCountFrequency (%)
1104 10
< 0.1%
1105 7
 
< 0.1%
1106 1
 
< 0.1%
1107 2
 
< 0.1%
1108 13
< 0.1%
1109 18
< 0.1%
1118 8
< 0.1%
1119 7
 
< 0.1%
1128 4
 
< 0.1%
1129 7
 
< 0.1%
ValueCountFrequency (%)
99338 103
< 0.1%
99337 146
< 0.1%
99336 126
< 0.1%
99224 123
< 0.1%
99223 92
< 0.1%
99218 33
 
< 0.1%
99217 80
< 0.1%
99216 24
 
< 0.1%
99212 50
 
< 0.1%
99208 190
0.1%
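
Several of the zipcode statistics above are derived from others, which gives a quick consistency check on the report. A small sketch using the rounded values as printed; the variable names are illustrative:

```python
import math

# Values copied from the zipcode statistics above.
n = 365369
mean = 51624.456
std = 26828.688
minimum, q1, q3, maximum = 1104, 32836, 77386, 99338

cv = std / mean                 # coefficient of variation
variance = std ** 2             # variance is the squared standard deviation
value_range = maximum - minimum
iqr = q3 - q1
total = mean * n                # zipcode has no missing values, so n is the full row count

print(f"CV = {cv:.7f}")
print(f"Variance = {variance:.4e}")
print(f"Range = {value_range}, IQR = {iqr}")
print(f"Sum = {total:.4e}")
```

Small differences in the last digits are expected because the inputs are already rounded.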

target_clean
Real number (ℝ)

HIGH CORRELATION  SKEWED 

Distinct: 34054
Distinct (%): 9.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 644634.16
Minimum: 1
Maximum: 1.95 × 10^8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 5.6 MiB

Quantile statistics

Minimum: 1
5-th percentile: 39500
Q1: 189900
Median: 324500
Q3: 585000
95-th percentile: 1950000
Maximum: 1.95 × 10^8
Range: 1.95 × 10^8
Interquartile range (IQR): 395100

Descriptive statistics

Standard deviation: 1835212.7
Coefficient of variation (CV): 2.8469058
Kurtosis: 1393.2279
Mean: 644634.16
Median Absolute Deviation (MAD): 168600
Skewness: 25.647721
Sum: 2.3552934 × 10^11
Variance: 3.3680058 × 10^12
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
225000 1775
 
0.5%
350000 1617
 
0.4%
275000 1610
 
0.4%
250000 1595
 
0.4%
325000 1534
 
0.4%
399000 1516
 
0.4%
299900 1507
 
0.4%
249900 1472
 
0.4%
299000 1431
 
0.4%
450000 1418
 
0.4%
Other values (34044) 349894
95.8%
ValueCountFrequency (%)
1 13
< 0.1%
3 2
 
< 0.1%
8 1
 
< 0.1%
20 1
 
< 0.1%
25 1
 
< 0.1%
29 1
 
< 0.1%
30 1
 
< 0.1%
250 1
 
< 0.1%
393 1
 
< 0.1%
400 1
 
< 0.1%
ValueCountFrequency (%)
195000000 1
< 0.1%
165000000 2
< 0.1%
150000000 1
< 0.1%
129000000 1
< 0.1%
115000000 2
< 0.1%
110000000 2
< 0.1%
98000000 1
< 0.1%
88000000 1
< 0.1%
87000000 1
< 0.1%
85000000 1
< 0.1%
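
The skewness reported for target_clean (γ1 ≈ 25.6) reflects a long right tail: a handful of listings in the 10^8 range against a median of 324500. The sketch below computes the moment-based skewness γ1 = m3 / m2^1.5 on a toy sample and on its logarithm. The sample values are made up for illustration, and the report may use a bias-corrected estimator, so the numbers are not expected to match the report:

```python
import math

def skewness(values):
    """Population skewness: third central moment over the 1.5 power of the second."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5  # undefined (division by zero) for constant data

# Toy prices with one extreme listing; not taken from the dataset.
prices = [189900, 250000, 324500, 585000, 1950000, 195000000]
log_prices = [math.log(p) for p in prices]

print(f"skewness(prices) = {skewness(prices):.2f}")
print(f"skewness(log prices) = {skewness(log_prices):.2f}")
```

Taking logarithms compresses the tail, which is consistent with the much smaller skewness magnitude reported for log_target_clean.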

log_target_clean
Real number (ℝ)

HIGH CORRELATION 

Distinct: 34054
Distinct (%): 9.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12.652744
Minimum: 0
Maximum: 19.08851
Zeros: 13
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 5.6 MiB

Quantile statistics

Minimum: 0
5-th percentile: 10.584056
Q1: 12.154253
Median: 12.690041
Q3: 13.279367
95-th percentile: 14.48334
Maximum: 19.08851
Range: 19.08851
Interquartile range (IQR): 1.1251142

Descriptive statistics

Standard deviation: 1.1925631
Coefficient of variation (CV): 0.094253317
Kurtosis: 3.6599259
Mean: 12.652744
Median Absolute Deviation (MAD): 0.56192971
Skewness: -0.68997535
Sum: 4622920.3
Variance: 1.4222066
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
12.32385568 1775
 
0.5%
12.76568843 1617
 
0.4%
12.52452638 1610
 
0.4%
12.4292162 1595
 
0.4%
12.69158046 1534
 
0.4%
12.8967167 1516
 
0.4%
12.61120436 1507
 
0.4%
12.42881612 1472
 
0.4%
12.60819885 1431
 
0.4%
13.01700286 1418
 
0.4%
Other values (34044) 349894
95.8%
ValueCountFrequency (%)
0 13
< 0.1%
1.098612289 2
 
< 0.1%
2.079441542 1
 
< 0.1%
2.995732274 1
 
< 0.1%
3.218875825 1
 
< 0.1%
3.36729583 1
 
< 0.1%
3.401197382 1
 
< 0.1%
5.521460918 1
 
< 0.1%
5.973809612 1
 
< 0.1%
5.991464547 1
 
< 0.1%
ValueCountFrequency (%)
19.08851012 1
< 0.1%
18.92145603 2
< 0.1%
18.82614585 1
< 0.1%
18.67532296 1
< 0.1%
18.56044269 2
< 0.1%
18.51599092 2
< 0.1%
18.40047804 1
< 0.1%
18.29284737 1
< 0.1%
18.28141868 1
< 0.1%
18.25816181 1
< 0.1%

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.6 MiB
0 321341
1 44028

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 365369
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 1
4th row: 0
5th row: 0

Common Values

ValueCountFrequency (%)
0 321341
87.9%
1 44028
 
12.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 321341
87.9%
1 44028
 
12.1%

Most occurring characters

ValueCountFrequency (%)
0 321341
87.9%
1 44028
 
12.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 365369
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 321341
87.9%
1 44028
 
12.1%

Most occurring scripts

ValueCountFrequency (%)
Common 365369
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 321341
87.9%
1 44028
 
12.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 365369
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 321341
87.9%
1 44028
 
12.1%

mlsid_join_bool
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.6 MiB
1 326793
0 38576

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 365369
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

ValueCountFrequency (%)
1 326793
89.4%
0 38576
 
10.6%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
1 326793
89.4%
0 38576
 
10.6%

Most occurring characters

ValueCountFrequency (%)
1 326793
89.4%
0 38576
 
10.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 365369
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 326793
89.4%
0 38576
 
10.6%

Most occurring scripts

ValueCountFrequency (%)
Common 365369
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 326793
89.4%
0 38576
 
10.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 365369
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 326793
89.4%
0 38576
 
10.6%

sqft_clean
Real number (ℝ)

HIGH CORRELATION  SKEWED  ZEROS 

Distinct: 9841
Distinct (%): 2.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2650.1925
Minimum: 0
Maximum: 7078574
Zeros: 48935
Zeros (%): 13.4%
Negative: 0
Negative (%): 0.0%
Memory size: 5.6 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1050
Median: 1666
Q3: 2470
95-th percentile: 4542
Maximum: 7078574
Range: 7078574
Interquartile range (IQR): 1420

Descriptive statistics

Standard deviation: 30587.402
Coefficient of variation (CV): 11.541578
Kurtosis: 30142.676
Mean: 2650.1925
Median Absolute Deviation (MAD): 691
Skewness: 150.38446
Sum: 9.6829817 × 10^8
Variance: 9.3558917 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 48935
 
13.4%
1200 1388
 
0.4%
1000 996
 
0.3%
1500 975
 
0.3%
1800 951
 
0.3%
1100 921
 
0.3%
1400 877
 
0.2%
2000 840
 
0.2%
1600 799
 
0.2%
800 735
 
0.2%
Other values (9831) 307952
84.3%
ValueCountFrequency (%)
0 48935
13.4%
1 71
 
< 0.1%
2 6
 
< 0.1%
3 2
 
< 0.1%
4 1
 
< 0.1%
5 2
 
< 0.1%
6 1
 
< 0.1%
10 2
 
< 0.1%
11 1
 
< 0.1%
12 1
 
< 0.1%
ValueCountFrequency (%)
7078574 3
< 0.1%
5728968 1
 
< 0.1%
4356000 2
< 0.1%
2807917 2
< 0.1%
2613600 1
 
< 0.1%
2585006 2
< 0.1%
1916640 1
 
< 0.1%
1761113 1
 
< 0.1%
1611720 1
 
< 0.1%
1598652 1
 
< 0.1%

log_sqft_clean
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 9841
Distinct (%): 2.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.5390758
Minimum: 0
Maximum: 15.772583
Zeros: 48935
Zeros (%): 13.4%
Negative: 0
Negative (%): 0.0%
Memory size: 5.6 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 6.9574974
Median: 7.4187809
Q3: 7.8123782
95-th percentile: 8.4213429
Maximum: 15.772583
Range: 15.772583
Interquartile range (IQR): 0.85488084

Descriptive statistics

Standard deviation: 2.6354149
Coefficient of variation (CV): 0.40302559
Kurtosis: 2.2051768
Mean: 6.5390758
Median Absolute Deviation (MAD): 0.4211849
Skewness: -1.9278333
Sum: 2389175.6
Variance: 6.9454118
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 48935
 
13.4%
7.090909822 1388
 
0.4%
6.908754779 996
 
0.3%
7.313886832 975
 
0.3%
7.496097345 951
 
0.3%
7.003974137 921
 
0.3%
7.244941546 877
 
0.2%
7.601402335 840
 
0.2%
7.378383713 799
 
0.2%
6.685860947 735
 
0.2%
Other values (9831) 307952
84.3%
ValueCountFrequency (%)
0 48935
13.4%
0.6931471806 71
 
< 0.1%
1.098612289 6
 
< 0.1%
1.386294361 2
 
< 0.1%
1.609437912 1
 
< 0.1%
1.791759469 2
 
< 0.1%
1.945910149 1
 
< 0.1%
2.397895273 2
 
< 0.1%
2.48490665 1
 
< 0.1%
2.564949357 1
 
< 0.1%
ValueCountFrequency (%)
15.77258317 3
< 0.1%
15.56104614 1
 
< 0.1%
15.28706499 2
< 0.1%
14.84795384 2
< 0.1%
14.77623952 1
 
< 0.1%
14.76523877 2
< 0.1%
14.46608473 1
 
< 0.1%
14.38145712 1
 
< 0.1%
14.29281311 1
 
< 0.1%
14.28467196 1
 
< 0.1%
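
Comparing the extreme and common values of the raw and log columns suggests that the two log features were built with different transforms: log_target_clean matches ln(x) (a target of 3 maps to 1.098612289), while log_sqft_clean matches ln(1 + x) (a sqft of 1 maps to 0.6931471806 and 0 maps to 0, which is why the zeros survive). This is an inference from the reported values, not a documented property of the pipeline. A short check using value pairs copied from the tables above:

```python
import math

# (raw value, reported log value) pairs copied from the value tables.
target_pairs = [(1, 0.0), (3, 1.098612289), (225000, 12.32385568), (195000000, 19.08851012)]
sqft_pairs = [(0, 0.0), (1, 0.6931471806), (1200, 7.090909822), (7078574, 15.77258317)]

# Hypothesis: plain natural log for the target, log1p for the area.
target_ok = all(math.isclose(math.log(x), y, abs_tol=1e-6) for x, y in target_pairs)
sqft_ok = all(math.isclose(math.log1p(x), y, abs_tol=1e-6) for x, y in sqft_pairs)

print("log_target_clean consistent with ln(x):", target_ok)
print("log_sqft_clean consistent with ln(1 + x):", sqft_ok)
```

If this reading is right, the 48935 zeros in log_sqft_clean are the same records as the zeros in sqft_clean rather than genuine one-square-foot listings.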

baths_clean
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 78
Distinct (%): < 0.1%
Missing: 102041
Missing (%): 27.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.723222
Minimum: 0
Maximum: 76
Zeros: 3581
Zeros (%): 1.0%
Negative: 0
Negative (%): 0.0%
Memory size: 5.6 MiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 2
Median: 2.5
Q3: 3
95-th percentile: 5
Maximum: 76
Range: 76
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.4458969
Coefficient of variation (CV): 0.53095078
Kurtosis: 121.36787
Mean: 2.723222
Median Absolute Deviation (MAD): 0.5
Skewness: 5.4607315
Sum: 717100.6
Variance: 2.0906177
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
2 101385
27.7%
3 65319
17.9%
4 25701
 
7.0%
1 17174
 
4.7%
2.5 13735
 
3.8%
5 9291
 
2.5%
3.5 6062
 
1.7%
1.5 4253
 
1.2%
6 4228
 
1.2%
0 3581
 
1.0%
Other values (68) 12599
 
3.4%
(Missing) 102041
27.9%
ValueCountFrequency (%)
0 3581
 
1.0%
0.5 1
 
< 0.1%
0.75 2
 
< 0.1%
1 17174
 
4.7%
1.1 13
 
< 0.1%
1.25 1186
 
0.3%
1.5 4253
 
1.2%
1.75 1787
 
0.5%
2 101385
27.7%
2.1 50
 
< 0.1%
ValueCountFrequency (%)
76 1
 
< 0.1%
68 1
 
< 0.1%
64 1
 
< 0.1%
60 1
 
< 0.1%
55 1
 
< 0.1%
44 3
< 0.1%
43 1
 
< 0.1%
42 1
 
< 0.1%
41 1
 
< 0.1%
40 4
< 0.1%
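
Because baths_clean has 27.9% missing values, it matters which denominator the mean uses. The reported sum and mean are consistent with missing rows being excluded rather than counted as zero, as the following check on the printed figures suggests (variable names are illustrative):

```python
# Values copied from the dataset overview and the baths_clean statistics.
n_observations = 365369
n_missing = 102041
reported_sum = 717100.6
reported_mean = 2.723222

n_present = n_observations - n_missing
mean_excluding_missing = reported_sum / n_present
mean_missing_as_zero = reported_sum / n_observations

print(f"Non-missing rows: {n_present}")
print(f"Mean over non-missing rows: {mean_excluding_missing:.6f}")
print(f"Mean if missing counted as 0: {mean_missing_as_zero:.6f}")
```

Only the first of the two means agrees with the value in the report, so any imputation applied later will shift the mean away from 2.72 unless it is chosen with that in mind.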

fireplace_booled
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.6 MiB
0 266863
1 98506

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 365369
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 0
3rd row: 1
4th row: 1
5th row: 0

Common Values

ValueCountFrequency (%)
0 266863
73.0%
1 98506
 
27.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 266863
73.0%
1 98506
 
27.0%

Most occurring characters

ValueCountFrequency (%)
0 266863
73.0%
1 98506
 
27.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 365369
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 266863
73.0%
1 98506
 
27.0%

Most occurring scripts

ValueCountFrequency (%)
Common 365369
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 266863
73.0%
1 98506
 
27.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 365369
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 266863
73.0%
1 98506
 
27.0%

population
Real number (ℝ)

HIGH CORRELATION 

Distinct: 4014
Distinct (%): 1.1%
Missing: 866
Missing (%): 0.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 33388.017
Minimum: 79
Maximum: 113916
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 5.6 MiB

Quantile statistics

Minimum: 79
5-th percentile: 9384
Q1: 20507
Median: 31099
Q3: 43194
95-th percentile: 66551
Maximum: 113916
Range: 113837
Interquartile range (IQR): 22687

Descriptive statistics

Standard deviation: 17679.901
Coefficient of variation (CV): 0.52952833
Kurtosis: 1.2283742
Mean: 33388.017
Median Absolute Deviation (MAD): 11232
Skewness: 0.90472715
Sum: 1.2170032 × 10^10
Variance: 3.1257889 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
37821 2104
 
0.6%
14917 1550
 
0.4%
13692 1474
 
0.4%
56511 1361
 
0.4%
19410 1298
 
0.4%
11165 1295
 
0.4%
29007 1252
 
0.3%
30170 1240
 
0.3%
44817 1211
 
0.3%
26108 1161
 
0.3%
Other values (4004) 350557
95.9%
ValueCountFrequency (%)
79 2
 
< 0.1%
105 1
 
< 0.1%
114 15
< 0.1%
153 18
< 0.1%
164 1
 
< 0.1%
169 1
 
< 0.1%
170 2
 
< 0.1%
179 1
 
< 0.1%
191 6
 
< 0.1%
197 1
 
< 0.1%
ValueCountFrequency (%)
113916 117
< 0.1%
111086 134
< 0.1%
109931 131
< 0.1%
105549 54
 
< 0.1%
103892 97
< 0.1%
103689 44
 
< 0.1%
101572 147
< 0.1%
100820 123
< 0.1%
99598 199
0.1%
98592 109
< 0.1%

lat
Real number (ℝ)

HIGH CORRELATION 

Distinct: 1508
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 33.47784
Minimum: 25.56
Maximum: 48.79
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 5.6 MiB

Quantile statistics

Minimum: 25.56
5-th percentile: 25.82
Q1: 28.8
Median: 32.78
Q3: 38.78
95-th percentile: 43.04
Maximum: 48.79
Range: 23.23
Interquartile range (IQR): 9.98

Descriptive statistics

Standard deviation: 5.9106505
Coefficient of variation (CV): 0.17655412
Kurtosis: -0.67189639
Mean: 33.47784
Median Absolute Deviation (MAD): 4.38
Skewness: 0.54441271
Sum: 12231765
Variance: 34.93579
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
25.85 3088
 
0.8%
25.77 2879
 
0.8%
25.78 2738
 
0.7%
29.8 2543
 
0.7%
25.82 2542
 
0.7%
29.57 2483
 
0.7%
29.74 2285
 
0.6%
29.47 2048
 
0.6%
29.4 2030
 
0.6%
29.77 1957
 
0.5%
Other values (1498) 340776
93.3%
ValueCountFrequency (%)
25.56 281
 
0.1%
25.57 202
 
0.1%
25.6 580
0.2%
25.61 542
0.1%
25.64 74
 
< 0.1%
25.65 713
0.2%
25.66 389
0.1%
25.67 664
0.2%
25.68 567
0.2%
25.7 955
0.3%
ValueCountFrequency (%)
48.79 140
< 0.1%
48.73 49
 
< 0.1%
48.68 133
< 0.1%
48.21 4
 
< 0.1%
48.18 1
 
< 0.1%
48.09 70
< 0.1%
48.05 142
< 0.1%
47.98 72
< 0.1%
47.95 8
 
< 0.1%
47.94 142
< 0.1%

lng
Real number (ℝ)

HIGH CORRELATION 

Distinct: 1955
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -90.487914
Minimum: -123.4
Maximum: -67.07
Zeros: 0
Zeros (%): 0.0%
Negative: 365369
Negative (%): 100.0%
Memory size: 5.6 MiB

Quantile statistics

Minimum: -123.4
5-th percentile: -121.88
Q1: -97.34
Median: -83.65
Q3: -80.41
95-th percentile: -73.99
Maximum: -67.07
Range: 56.33
Interquartile range (IQR): 16.93

Descriptive statistics

Standard deviation: 13.975913
Coefficient of variation (CV): -0.15445061
Kurtosis: -0.036098161
Mean: -90.487914
Median Absolute Deviation (MAD): 5.03
Skewness: -1.0305054
Sum: -33061479
Variance: 195.32616
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
-80.18 5627
 
1.5%
-80.13 4111
 
1.1%
-80.14 3022
 
0.8%
-80.27 3004
 
0.8%
-98.73 2576
 
0.7%
-80.23 2497
 
0.7%
-80.24 2490
 
0.7%
-80.17 2304
 
0.6%
-73.95 2147
 
0.6%
-81.21 2109
 
0.6%
Other values (1945) 335482
91.8%
ValueCountFrequency (%)
-123.4 26
 
< 0.1%
-123.22 17
 
< 0.1%
-123.19 10
 
< 0.1%
-123.13 4
 
< 0.1%
-123.11 17
 
< 0.1%
-123.08 27
 
< 0.1%
-123.07 2
 
< 0.1%
-123.06 87
< 0.1%
-123.05 4
 
< 0.1%
-123.04 71
< 0.1%
ValueCountFrequency (%)
-67.07 1
 
< 0.1%
-67.16 46
< 0.1%
-69.55 1
 
< 0.1%
-69.6 63
< 0.1%
-69.71 81
< 0.1%
-69.79 43
< 0.1%
-70.22 2
 
< 0.1%
-70.29 13
 
< 0.1%
-70.3 4
 
< 0.1%
-70.34 7
 
< 0.1%

state_zip
Categorical

HIGH CORRELATION 

Distinct: 34
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.6 MiB
FL 113038
TX 81563
NY 23269
CA 22989
NC 21378
Other values (29) 103132

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 730738
Distinct characters: 24
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: NC
2nd row: WA
3rd row: CA
4th row: TX
5th row: FL

Common Values

ValueCountFrequency (%)
FL 113038
30.9%
TX 81563
22.3%
NY 23269
 
6.4%
CA 22989
 
6.3%
NC 21378
 
5.9%
TN 17655
 
4.8%
WA 13530
 
3.7%
OH 11743
 
3.2%
NV 8357
 
2.3%
IL 8102
 
2.2%
Other values (24) 43745
 
12.0%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
fl 113038
30.9%
tx 81563
22.3%
ny 23269
 
6.4%
ca 22989
 
6.3%
nc 21378
 
5.9%
tn 17655
 
4.8%
wa 13530
 
3.7%
oh 11743
 
3.2%
nv 8357
 
2.3%
il 8102
 
2.2%
Other values (24) 43745
 
12.0%

Most occurring characters

ValueCountFrequency (%)
L 121141
16.6%
F 113038
15.5%
T 101253
13.9%
X 81563
11.2%
N 74214
10.2%
C 54927
7.5%
A 53756
7.4%
Y 23357
 
3.2%
O 21388
 
2.9%
I 16878
 
2.3%
Other values (14) 69223
9.5%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 730738
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
L 121141
16.6%
F 113038
15.5%
T 101253
13.9%
X 81563
11.2%
N 74214
10.2%
C 54927
7.5%
A 53756
7.4%
Y 23357
 
3.2%
O 21388
 
2.9%
I 16878
 
2.3%
Other values (14) 69223
9.5%

Most occurring scripts

ValueCountFrequency (%)
Latin 730738
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
L 121141
16.6%
F 113038
15.5%
T 101253
13.9%
X 81563
11.2%
N 74214
10.2%
C 54927
7.5%
A 53756
7.4%
Y 23357
 
3.2%
O 21388
 
2.9%
I 16878
 
2.3%
Other values (14) 69223
9.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 730738
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
L 121141
16.6%
F 113038
15.5%
T 101253
13.9%
X 81563
11.2%
N 74214
10.2%
C 54927
7.5%
A 53756
7.4%
Y 23357
 
3.2%
O 21388
 
2.9%
I 16878
 
2.3%
Other values (14) 69223
9.5%

median_household_income
Real number (ℝ)

HIGH CORRELATION 

Distinct: 4016
Distinct (%): 1.1%
Missing: 866
Missing (%): 0.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 57331.753
Minimum: 9106
Maximum: 230952
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 5.6 MiB

Quantile statistics

Minimum: 9106
5-th percentile: 27180
Q1: 41410
Median: 52316
Q3: 68977
95-th percentile: 103309
Maximum: 230952
Range: 221846
Interquartile range (IQR): 27567

Descriptive statistics

Standard deviation: 23358.446
Coefficient of variation (CV): 0.40742599
Kurtosis: 2.362639
Mean: 57331.753
Median Absolute Deviation (MAD): 13211
Skewness: 1.2034324
Sum: 2.0897596 × 10^10
Variance: 5.4561701 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
51153 2104
 
0.6%
78094 1550
 
0.4%
64823 1474
 
0.4%
59785 1361
 
0.4%
48959 1298
 
0.4%
57843 1295
 
0.4%
85508 1252
 
0.3%
42163 1240
 
0.3%
76649 1211
 
0.3%
22813 1161
 
0.3%
Other values (4006) 350557
95.9%
ValueCountFrequency (%)
9106 8
 
< 0.1%
9475 1
 
< 0.1%
9954 19
 
< 0.1%
10422 10
 
< 0.1%
12143 104
< 0.1%
12149 1
 
< 0.1%
12457 30
 
< 0.1%
12534 6
 
< 0.1%
13350 3
 
< 0.1%
13415 9
 
< 0.1%
ValueCountFrequency (%)
230952 10
 
< 0.1%
216037 111
< 0.1%
214219 2
 
< 0.1%
196637 1
 
< 0.1%
192648 7
 
< 0.1%
187768 3
 
< 0.1%
183966 3
 
< 0.1%
183833 10
 
< 0.1%
180583 1
 
< 0.1%
180540 9
 
< 0.1%

housing_units
Real number (ℝ)

HIGH CORRELATION 

Distinct: 3807
Distinct (%): 1.0%
Missing: 866
Missing (%): 0.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 14793.575
Minimum: 38
Maximum: 47617
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 5.6 MiB

Quantile statistics

Minimum: 38
5-th percentile: 4301
Q1: 9817
Median: 14089
Q3: 19008
95-th percentile: 27877
Maximum: 47617
Range: 47579
Interquartile range (IQR): 9191

Descriptive statistics

Standard deviation: 7107.8074
Coefficient of variation (CV): 0.48046585
Kurtosis: 0.64072833
Mean: 14793.575
Median Absolute Deviation (MAD): 4502
Skewness: 0.6787794
Sum: 5.3923024 × 10^9
Variance: 50520927
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
20744 2104
 
0.6%
13481 1550
 
0.4%
20571 1474
 
0.4%
9817 1424
 
0.4%
18749 1361
 
0.4%
10825 1298
 
0.4%
7504 1295
 
0.4%
13256 1247
 
0.3%
15250 1211
 
0.3%
14259 1161
 
0.3%
Other values (3797) 350378
95.9%
ValueCountFrequency (%)
38 2
 
< 0.1%
59 1
 
< 0.1%
63 1
 
< 0.1%
88 1
 
< 0.1%
96 2
 
< 0.1%
107 1
 
< 0.1%
110 1
 
< 0.1%
113 1
 
< 0.1%
128 2
 
< 0.1%
130 6
< 0.1%
ValueCountFrequency (%)
47617 191
0.1%
41483 228
0.1%
39547 161
 
< 0.1%
39402 364
0.1%
38453 154
 
< 0.1%
37951 306
0.1%
37745 147
 
< 0.1%
37619 434
0.1%
37598 95
 
< 0.1%
37432 67
 
< 0.1%

occupied_housing_units
Real number (ℝ)

HIGH CORRELATION 

Distinct: 3769
Distinct (%): 1.0%
Missing: 866
Missing (%): 0.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 12960.053
Minimum: 33
Maximum: 44432
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 5.6 MiB

Quantile statistics

Minimum: 33
5-th percentile: 3705
Q1: 8506
Median: 12354
Q3: 16556
95-th percentile: 24404
Maximum: 44432
Range: 44399
Interquartile range (IQR): 8050

Descriptive statistics

Standard deviation: 6437.3506
Coefficient of variation (CV): 0.49670713
Kurtosis: 0.85429206
Mean: 12960.053
Median Absolute Deviation (MAD): 3997
Skewness: 0.74138286
Sum: 4.723978 × 10^9
Variance: 41439482
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
16012 2104
 
0.6%
8775 1574
 
0.4%
5150 1474
 
0.4%
17691 1361
 
0.4%
5672 1329
 
0.4%
8731 1298
 
0.4%
9435 1252
 
0.3%
10181 1240
 
0.3%
14708 1211
 
0.3%
12087 1161
 
0.3%
Other values (3759) 350499
95.9%
ValueCountFrequency (%)
33 2
 
< 0.1%
53 1
 
< 0.1%
55 1
 
< 0.1%
58 15
< 0.1%
73 1
 
< 0.1%
76 1
 
< 0.1%
81 7
 
< 0.1%
85 2
 
< 0.1%
87 2
 
< 0.1%
92 18
< 0.1%
ValueCountFrequency (%)
44432 191
0.1%
37865 228
0.1%
36101 147
< 0.1%
35524 67
 
< 0.1%
35407 161
< 0.1%
34850 306
0.1%
34843 109
 
< 0.1%
34452 134
 
< 0.1%
34383 364
0.1%
34330 154
< 0.1%

major_city_ch
Categorical

HIGH CORRELATION 

Distinct: 26
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.6 MiB
others 167072
Houston 24054
Miami 21292
San Antonio 15918
Fort Lauderdale 12060
Other values (21) 124973

Length

Max length: 16
Median length: 15
Mean length: 7.4823261
Min length: 5

Characters and Unicode

Total characters: 2733810
Distinct characters: 40
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: others
2nd row: others
3rd row: Los Angeles
4th row: Dallas
5th row: others

Common Values

ValueCountFrequency (%)
others 167072
45.7%
Houston 24054
 
6.6%
Miami 21292
 
5.8%
San Antonio 15918
 
4.4%
Fort Lauderdale 12060
 
3.3%
Jacksonville 9612
 
2.6%
Dallas 8822
 
2.4%
Cleveland 7087
 
1.9%
Orlando 7012
 
1.9%
Brooklyn 6901
 
1.9%
Other values (16) 85539
23.4%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
others 167072
39.3%
miami 25054
 
5.9%
houston 24054
 
5.7%
fort 17890
 
4.2%
san 15918
 
3.7%
antonio 15918
 
3.7%
lauderdale 12060
 
2.8%
jacksonville 9612
 
2.3%
dallas 8822
 
2.1%
cleveland 7087
 
1.7%
Other values (22) 121329
28.6%

Most occurring characters

ValueCountFrequency (%)
o 329213
12.0%
e 287003
10.5%
t 283775
10.4%
s 260743
9.5%
r 240301
8.8%
h 207950
 
7.6%
a 186411
 
6.8%
n 130723
 
4.8%
i 121921
 
4.5%
l 116560
 
4.3%
Other values (30) 569210
20.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2416619
88.4%
Uppercase Letter 257744
 
9.4%
Space Separator 59447
 
2.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 329213
13.6%
e 287003
11.9%
t 283775
11.7%
s 260743
10.8%
r 240301
9.9%
h 207950
8.6%
a 186411
7.7%
n 130723
 
5.4%
i 121921
 
5.0%
l 116560
 
4.8%
Other values (11) 252019
10.4%
Uppercase Letter
ValueCountFrequency (%)
A 30352
11.8%
S 25904
10.1%
M 25054
9.7%
H 24054
9.3%
L 22178
 
8.6%
C 19945
 
7.7%
F 17890
 
6.9%
N 11537
 
4.5%
B 10663
 
4.1%
W 10225
 
4.0%
Other values (8) 59942
23.3%
Space Separator
ValueCountFrequency (%)
59447
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2674363
97.8%
Common 59447
 
2.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 329213
12.3%
e 287003
10.7%
t 283775
10.6%
s 260743
9.7%
r 240301
9.0%
h 207950
7.8%
a 186411
 
7.0%
n 130723
 
4.9%
i 121921
 
4.6%
l 116560
 
4.4%
Other values (29) 509763
19.1%
Common
ValueCountFrequency (%)
59447
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2733810
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 329213
12.0%
e 287003
10.5%
t 283775
10.4%
s 260743
9.5%
r 240301
8.8%
h 207950
 
7.6%
a 186411
 
6.8%
n 130723
 
4.8%
i 121921
 
4.5%
l 116560
 
4.3%
Other values (30) 569210
20.8%

status_cl
Categorical

IMBALANCE 

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.6 MiB
active 298185
not defined 39990
foreclosure 9411
new construction 5352
pending 5217
Other values (5) 7214

Length

Max length: 16
Median length: 6
Mean length: 6.93552
Min length: 6

Characters and Unicode

Total characters: 2534024
Distinct characters: 19
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: active
2nd row: active
3rd row: active
4th row: active
5th row: active

Common Values

ValueCountFrequency (%)
active 298185
81.6%
not defined 39990
 
10.9%
foreclosure 9411
 
2.6%
new construction 5352
 
1.5%
pending 5217
 
1.4%
under contract 3710
 
1.0%
auction 1308
 
0.4%
contingent 1066
 
0.3%
others 722
 
0.2%
for rent 408
 
0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value Count Frequency (%)
active 298185 71.9%
not 39990 9.6%
defined 39990 9.6%
foreclosure 9411 2.3%
new 5352 1.3%
construction 5352 1.3%
pending 5217 1.3%
under 3710 0.9%
contract 3710 0.9%
auction 1308 0.3%
Other values (4) 2604 0.6%

Most occurring characters

Value Count Frequency (%)
e 413462 16.3%
t 360869 14.2%
i 351118 13.9%
c 328094 12.9%
a 303203 12.0%
v 298185 11.8%
n 118804 4.7%
d 88907 3.5%
o 76730 3.0%
f 49809 2.0%
Other values (9) 144843 5.7%

Most occurring categories

Value Count Frequency (%)
Lowercase Letter 2484564 98.0%
Space Separator 49460 2.0%

Most frequent character per category

Lowercase Letter
Value Count Frequency (%)
e 413462 16.6%
t 360869 14.5%
i 351118 14.1%
c 328094 13.2%
a 303203 12.2%
v 298185 12.0%
n 118804 4.8%
d 88907 3.6%
o 76730 3.1%
f 49809 2.0%
Other values (8) 95383 3.8%
Space Separator
Value Count Frequency (%)
(space) 49460 100.0%

Most occurring scripts

Value Count Frequency (%)
Latin 2484564 98.0%
Common 49460 2.0%

Most frequent character per script

Latin
Value Count Frequency (%)
e 413462 16.6%
t 360869 14.5%
i 351118 14.1%
c 328094 13.2%
a 303203 12.2%
v 298185 12.0%
n 118804 4.8%
d 88907 3.6%
o 76730 3.1%
f 49809 2.0%
Other values (8) 95383 3.8%
Common
Value Count Frequency (%)
(space) 49460 100.0%

Most occurring blocks

Value Count Frequency (%)
ASCII 2534024 100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
e 413462 16.3%
t 360869 14.2%
i 351118 13.9%
c 328094 12.9%
a 303203 12.0%
v 298185 11.8%
n 118804 4.7%
d 88907 3.5%
o 76730 3.0%
f 49809 2.0%
Other values (9) 144843 5.7%
Distinct 11
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 5.6 MiB

single-family 192922
condo 50155
na 33361
land 29397
townhouse 17812
Other values (6) 41722

Length

Max length 13
Median length 13
Mean length 9.420063
Min length 2

Characters and Unicode

Total characters 3441799
Distinct characters 21
Distinct categories 2
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row single-family
2nd row single-family
3rd row single-family
4th row single-family
5th row land

Common Values

Value Count Frequency (%)
single-family 192922 52.8%
condo 50155 13.7%
na 33361 9.1%
land 29397 8.0%
townhouse 17812 4.9%
others 14603 4.0%
multi-family 11332 3.1%
traditional 6036 1.7%
coop 3792 1.0%
mobile 3459 0.9%

Length

Histogram of lengths of the category
Value Count Frequency (%)
single-family 192922 52.8%
condo 50155 13.7%
na 33361 9.1%
land 29397 8.0%
townhouse 17812 4.9%
others 14603 4.0%
multi-family 11332 3.1%
traditional 6036 1.7%
coop 3792 1.0%
mobile 3459 0.9%

Most occurring characters

Value Count Frequency (%)
l 447400 13.0%
i 424039 12.3%
n 332183 9.7%
a 281584 8.2%
e 228796 6.6%
s 225337 6.5%
m 219045 6.4%
y 204254 5.9%
f 204254 5.9%
- 204254 5.9%
Other values (11) 670653 19.5%

Most occurring categories

Value Count Frequency (%)
Lowercase Letter 3237545 94.1%
Dash Punctuation 204254 5.9%

Most frequent character per category

Lowercase Letter
Value Count Frequency (%)
l 447400 13.8%
i 424039 13.1%
n 332183 10.3%
a 281584 8.7%
e 228796 7.1%
s 225337 7.0%
m 219045 6.8%
y 204254 6.3%
f 204254 6.3%
g 192922 6.0%
Other values (10) 477731 14.8%
Dash Punctuation
Value Count Frequency (%)
- 204254 100.0%

Most occurring scripts

Value Count Frequency (%)
Latin 3237545 94.1%
Common 204254 5.9%

Most frequent character per script

Latin
Value Count Frequency (%)
l 447400 13.8%
i 424039 13.1%
n 332183 10.3%
a 281584 8.7%
e 228796 7.1%
s 225337 7.0%
m 219045 6.8%
y 204254 6.3%
f 204254 6.3%
g 192922 6.0%
Other values (10) 477731 14.8%
Common
Value Count Frequency (%)
- 204254 100.0%

Most occurring blocks

Value Count Frequency (%)
ASCII 3441799 100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
l 447400 13.0%
i 424039 12.3%
n 332183 9.7%
a 281584 8.2%
e 228796 6.6%
s 225337 6.5%
m 219045 6.4%
y 204254 5.9%
f 204254 5.9%
- 204254 5.9%
Other values (11) 670653 19.5%

home_age
Real number (ℝ)

ZEROS 

Distinct 123
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 37.558143
Minimum 0
Maximum 122
Zeros 29405
Zeros (%) 8.0%
Negative 0
Negative (%) 0.0%
Memory size 5.6 MiB

Quantile statistics

Minimum 0
5-th percentile 0
Q1 11
median 36
Q3 58
95-th percentile 99
Maximum 122
Range 122
Interquartile range (IQR) 47

Descriptive statistics

Standard deviation 31.084398
Coefficient of variation (CV) 0.82763407
Kurtosis -0.26544588
Mean 37.558143
Median Absolute Deviation (MAD) 23
Skewness 0.73884158
Sum 13722581
Variance 966.23983
Monotonicity Not monotonic
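
Several of the home_age figures above are derived from one another, so they can be cross-checked with the standard definitions (CV = standard deviation / mean, variance = standard deviation squared, IQR = Q3 - Q1, range = maximum - minimum, mean = sum / count). The sketch below is such a check, not the profiler's own code; the row count 365369 is taken from the dataset overview and the variable names are illustrative.

```python
# Figures copied from the home_age statistics in this report.
mean = 37.558143
std = 31.084398
q1, q3 = 11, 58
minimum, maximum = 0, 122
total = 13722581
n_rows = 365369  # number of observations from the dataset overview

cv = std / mean                    # coefficient of variation
variance = std ** 2                # variance from the standard deviation
iqr = q3 - q1                      # interquartile range
value_range = maximum - minimum    # range
mean_from_sum = total / n_rows     # mean recomputed from the sum

print(f"CV={cv:.8f} variance={variance:.5f} IQR={iqr} "
      f"range={value_range} mean={mean_from_sum:.6f}")
```

Each recomputed value agrees with the reported one to within rounding of the inputs.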
Histogram with fixed size bins (bins=50)
Value Count Frequency (%)
36 37467 10.3%
3 30899 8.5%
0 29405 8.0%
16 7908 2.2%
17 7410 2.0%
15 7026 1.9%
4 6669 1.8%
18 5449 1.5%
5 5002 1.4%
6 4966 1.4%
Other values (113) 223168 61.1%

Value Count Frequency (%)
0 29405 8.0%
1 55 < 0.1%
2 2261 0.6%
3 30899 8.5%
4 6669 1.8%
5 5002 1.4%
6 4966 1.4%
7 3802 1.0%
8 3044 0.8%
9 2290 0.6%

Value Count Frequency (%)
122 2570 0.7%
121 536 0.1%
120 140 < 0.1%
119 169 < 0.1%
118 196 0.1%
117 686 0.2%
116 330 0.1%
115 315 0.1%
114 420 0.1%
113 361 0.1%
Distinct 2
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 5.6 MiB

0 217166
1 148203

Length

Max length 1
Median length 1
Mean length 1
Min length 1

Characters and Unicode

Total characters 365369
Distinct characters 2
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row 0
2nd row 0
3rd row 1
4th row 1
5th row 0

Common Values

Value Count Frequency (%)
0 217166 59.4%
1 148203 40.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Value Count Frequency (%)
0 217166 59.4%
1 148203 40.6%

Most occurring characters

Value Count Frequency (%)
0 217166 59.4%
1 148203 40.6%

Most occurring categories

Value Count Frequency (%)
Decimal Number 365369 100.0%

Most frequent character per category

Decimal Number
Value Count Frequency (%)
0 217166 59.4%
1 148203 40.6%

Most occurring scripts

Value Count Frequency (%)
Common 365369 100.0%

Most frequent character per script

Common
Value Count Frequency (%)
0 217166 59.4%
1 148203 40.6%

Most occurring blocks

Value Count Frequency (%)
ASCII 365369 100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
0 217166 59.4%
1 148203 40.6%

heating_system
Categorical

Distinct 7
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 5.6 MiB

AIR 145200
NO DATA 89853
OTHER 60207
NO HEATING NEED 29397
ELECTRIC 24730
Other values (2) 15982

Length

Max length 15
Median length 8
Mean length 5.6425833
Min length 3

Characters and Unicode

Total characters 2061625
Distinct characters 16
Distinct categories 2
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row OTHER
2nd row NO DATA
3rd row AIR
4th row AIR
5th row NO HEATING NEED

Common Values

Value Count Frequency (%)
AIR 145200 39.7%
NO DATA 89853 24.6%
OTHER 60207 16.5%
NO HEATING NEED 29397 8.0%
ELECTRIC 24730 6.8%
GAS 11343 3.1%
MULTI 4639 1.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value Count Frequency (%)
air 145200 28.2%
no 119250 23.2%
data 89853 17.5%
other 60207 11.7%
heating 29397 5.7%
need 29397 5.7%
electric 24730 4.8%
gas 11343 2.2%
multi 4639 0.9%

Most occurring characters

Value Count Frequency (%)
A 365646 17.7%
R 230137 11.2%
T 208826 10.1%
I 203966 9.9%
E 197858 9.6%
O 179457 8.7%
N 178044 8.6%
(space) 148647 7.2%
D 119250 5.8%
H 89604 4.3%
Other values (6) 140190 6.8%

Most occurring categories

Value Count Frequency (%)
Uppercase Letter 1912978 92.8%
Space Separator 148647 7.2%

Most frequent character per category

Uppercase Letter
Value Count Frequency (%)
A 365646 19.1%
R 230137 12.0%
T 208826 10.9%
I 203966 10.7%
E 197858 10.3%
O 179457 9.4%
N 178044 9.3%
D 119250 6.2%
H 89604 4.7%
C 49460 2.6%
Other values (5) 90730 4.7%
Space Separator
Value Count Frequency (%)
(space) 148647 100.0%

Most occurring scripts

Value Count Frequency (%)
Latin 1912978 92.8%
Common 148647 7.2%

Most frequent character per script

Latin
Value Count Frequency (%)
A 365646 19.1%
R 230137 12.0%
T 208826 10.9%
I 203966 10.7%
E 197858 10.3%
O 179457 9.4%
N 178044 9.3%
D 119250 6.2%
H 89604 4.7%
C 49460 2.6%
Other values (5) 90730 4.7%
Common
Value Count Frequency (%)
(space) 148647 100.0%

Most occurring blocks

Value Count Frequency (%)
ASCII 2061625 100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
A 365646 17.7%
R 230137 11.2%
T 208826 10.1%
I 203966 9.9%
E 197858 9.6%
O 179457 8.7%
N 178044 8.6%
(space) 148647 7.2%
D 119250 5.8%
H 89604 4.3%
Other values (6) 140190 6.8%

mean_rating
Real number (ℝ)

HIGH CORRELATION 

Distinct 231
Distinct (%) 0.1%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 4.9413036
Minimum 1
Maximum 9
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 5.6 MiB

Quantile statistics

Minimum 1
5-th percentile 2
Q1 3.5
median 5
Q3 6.3333333
95-th percentile 8
Maximum 9
Range 8
Interquartile range (IQR) 2.8333333

Descriptive statistics

Standard deviation 1.8305522
Coefficient of variation (CV) 0.37045936
Kurtosis -0.67126517
Mean 4.9413036
Median Absolute Deviation (MAD) 1.3333333
Skewness 0.19531527
Sum 1805399.1
Variance 3.3509212
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
Value Count Frequency (%)
5 23331 6.4%
4 21630 5.9%
6 20119 5.5%
3 19008 5.2%
6.333333333 14849 4.1%
4.666666667 14700 4.0%
3.333333333 14687 4.0%
7 13872 3.8%
5.666666667 13675 3.7%
2 13314 3.6%
Other values (221) 196184 53.7%

Value Count Frequency (%)
1 2068 0.6%
1.2 1 < 0.1%
1.25 17 < 0.1%
1.333333333 965 0.3%
1.4 10 < 0.1%
1.444444444 1 < 0.1%
1.5 1821 0.5%
1.555555556 4 < 0.1%
1.6 60 < 0.1%
1.625 4 < 0.1%

Value Count Frequency (%)
9 7059 1.9%
8.833333333 146 < 0.1%
8.8 8 < 0.1%
8.75 86 < 0.1%
8.666666667 2336 0.6%
8.6 169 < 0.1%
8.5 2980 0.8%
8.4 275 0.1%
8.333333333 2788 0.8%
8.25 970 0.3%

min_dist_school
Real number (ℝ)

Distinct 1529
Distinct (%) 0.4%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 1.1990203
Minimum 0
Maximum 48.74
Zeros 747
Zeros (%) 0.2%
Negative 0
Negative (%) 0.0%
Memory size 5.6 MiB

Quantile statistics

Minimum 0
5-th percentile 0.13
Q1 0.37
median 0.7
Q3 1.3
95-th percentile 3.65
Maximum 48.74
Range 48.74
Interquartile range (IQR) 0.93

Descriptive statistics

Standard deviation 2.0167967
Coefficient of variation (CV) 1.6820372
Kurtosis 117.34197
Mean 1.1990203
Median Absolute Deviation (MAD) 0.4
Skewness 8.670267
Sum 438084.84
Variance 4.0674691
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
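
The caption gives only the bin count. The sketch below shows fixed-width binning under the assumption that the 50 bins are equally wide and span the observed minimum and maximum, with the maximum folded into the last bin; the report does not spell this out. The function name is illustrative, and the seven sample values are the quantile summary figures for min_dist_school, not the raw data.

```python
def fixed_width_histogram(values, bins=50):
    """Count values in `bins` equal-width bins between min(values) and max(values)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are equal; bin width would be zero")
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum sits on the top edge and is folded into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts, width


# Quantile summary of min_dist_school (min, 5%, Q1, median, Q3, 95%, max).
sample = [0.0, 0.13, 0.37, 0.7, 1.3, 3.65, 48.74]
counts, width = fixed_width_histogram(sample, bins=50)
print(f"bin width: {width:.4f}")
print("non-empty bins:", {i: c for i, c in enumerate(counts) if c})
```

With a range of 48.74, each bin is about 0.97 wide, so the median (0.7) and everything below it fall into the first bin. This is why a heavily right-skewed variable tends to render as one tall bar followed by a long, nearly empty tail.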
Value Count Frequency (%)
0.3 21717 5.9%
0.4 20311 5.6%
0.2 19527 5.3%
0.5 18232 5.0%
0.6 15616 4.3%
0.7 13408 3.7%
0.1 12080 3.3%
0.8 10758 2.9%
0.9 9362 2.6%
1.1 7333 2.0%
Other values (1519) 217025 59.4%

Value Count Frequency (%)
0 747 0.2%
0.01 1 < 0.1%
0.02 21 < 0.1%
0.03 111 < 0.1%
0.04 178 < 0.1%
0.05 317 0.1%
0.06 380 0.1%
0.07 485 0.1%
0.08 520 0.1%
0.09 654 0.2%

Value Count Frequency (%)
48.74 1 < 0.1%
45.13 3 < 0.1%
40.65 1 < 0.1%
40.3 1 < 0.1%
39.69 180 < 0.1%
39.35 1 < 0.1%
37.86 1 < 0.1%
36.3 1 < 0.1%
33.7 1 < 0.1%
33.6 2 < 0.1%

Interactions

[Figure: grid of 225 pairwise interaction plots; images not preserved in this text export]

Correlations

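| | zipcode | target_clean | log_target_clean | sqft_clean | log_sqft_clean | baths_clean | population | lat | lng | median_household_income | housing_units | occupied_housing_units | home_age | mean_rating | min_dist_school | private_pool_join | mlsid_join_bool | fireplace_booled | state_zip | major_city_ch | status_cl | property_type_fin | remodeled_year_bool | heating_system |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| zipcode | 1.000 | 0.012 | 0.012 | 0.130 | 0.130 | 0.030 | -0.036 | 0.122 | -0.936 | 0.111 | -0.148 | -0.101 | -0.064 | 0.074 | -0.034 | 0.142 | 0.084 | 0.276 | 0.917 | 0.661 | 0.121 | 0.178 | 0.277 | 0.180 |
| target_clean | 0.012 | 1.000 | 1.000 | 0.506 | 0.506 | 0.531 | 0.000 | 0.080 | -0.003 | 0.436 | 0.049 | 0.064 | -0.016 | 0.313 | -0.104 | 0.035 | 0.001 | 0.009 | 0.022 | 0.045 | 0.000 | 0.010 | 0.007 | 0.010 |
| log_target_clean | 0.012 | 1.000 | 1.000 | 0.506 | 0.506 | 0.531 | 0.000 | 0.080 | -0.003 | 0.436 | 0.049 | 0.064 | -0.016 | 0.313 | -0.104 | 0.159 | 0.162 | 0.160 | 0.195 | 0.165 | 0.137 | 0.171 | 0.164 | 0.189 |
| sqft_clean | 0.130 | 0.506 | 0.506 | 1.000 | 1.000 | 0.713 | 0.025 | 0.004 | -0.137 | 0.276 | -0.007 | 0.016 | 0.001 | 0.221 | 0.031 | 0.000 | 0.003 | 0.003 | 0.024 | 0.000 | 0.000 | 0.015 | 0.007 | 0.007 |
| log_sqft_clean | 0.130 | 0.506 | 0.506 | 1.000 | 1.000 | 0.713 | 0.025 | 0.004 | -0.137 | 0.276 | -0.007 | 0.016 | 0.001 | 0.221 | 0.031 | 0.187 | 0.108 | 0.317 | 0.092 | 0.104 | 0.040 | 0.295 | 0.291 | 0.280 |
| baths_clean | 0.030 | 0.531 | 0.531 | 0.713 | 0.713 | 1.000 | 0.003 | 0.032 | -0.051 | 0.287 | -0.003 | 0.008 | -0.223 | 0.243 | 0.061 | 0.103 | 0.026 | 0.036 | 0.026 | 0.041 | 0.007 | 0.049 | 0.026 | 0.011 |
| population | -0.036 | 0.000 | 0.000 | 0.025 | 0.025 | 0.003 | 1.000 | 0.137 | 0.029 | -0.104 | 0.885 | 0.939 | 0.127 | -0.020 | -0.267 | 0.054 | 0.091 | 0.087 | 0.206 | 0.209 | 0.036 | 0.090 | 0.099 | 0.083 |
| lat | 0.122 | 0.080 | 0.080 | 0.004 | 0.004 | 0.032 | 0.137 | 1.000 | -0.191 | 0.080 | 0.093 | 0.161 | 0.179 | 0.005 | -0.276 | 0.170 | 0.080 | 0.322 | 0.761 | 0.620 | 0.125 | 0.152 | 0.292 | 0.179 |
| lng | -0.936 | -0.003 | -0.003 | -0.137 | -0.137 | -0.051 | 0.029 | -0.191 | 1.000 | -0.116 | 0.150 | 0.092 | 0.091 | -0.092 | 0.027 | 0.124 | 0.081 | 0.260 | 0.828 | 0.537 | 0.098 | 0.156 | 0.227 | 0.161 |
| median_household_income | 0.111 | 0.436 | 0.436 | 0.276 | 0.276 | 0.287 | -0.104 | 0.080 | -0.116 | 1.000 | -0.106 | -0.062 | -0.167 | 0.597 | 0.065 | 0.123 | 0.060 | 0.157 | 0.162 | 0.189 | 0.040 | 0.072 | 0.091 | 0.073 |
| housing_units | -0.148 | 0.049 | 0.049 | -0.007 | -0.007 | -0.003 | 0.885 | 0.093 | 0.150 | -0.106 | 1.000 | 0.965 | 0.147 | -0.026 | -0.218 | 0.041 | 0.049 | 0.071 | 0.207 | 0.255 | 0.029 | 0.101 | 0.125 | 0.085 |
| occupied_housing_units | -0.101 | 0.064 | 0.064 | 0.016 | 0.016 | 0.008 | 0.939 | 0.161 | 0.092 | -0.062 | 0.965 | 1.000 | 0.155 | 0.009 | -0.261 | 0.051 | 0.063 | 0.088 | 0.207 | 0.237 | 0.036 | 0.110 | 0.124 | 0.100 |
| home_age | -0.064 | -0.016 | -0.016 | 0.001 | 0.001 | -0.223 | 0.127 | 0.179 | 0.091 | -0.167 | 0.147 | 0.155 | 1.000 | -0.167 | -0.285 | 0.183 | 0.168 | 0.168 | 0.173 | 0.166 | 0.075 | 0.233 | 0.408 | 0.257 |
| mean_rating | 0.074 | 0.313 | 0.313 | 0.221 | 0.221 | 0.243 | -0.020 | 0.005 | -0.092 | 0.597 | -0.026 | 0.009 | -0.167 | 1.000 | 0.096 | 0.109 | 0.068 | 0.128 | 0.182 | 0.205 | 0.050 | 0.069 | 0.084 | 0.062 |
| min_dist_school | -0.034 | -0.104 | -0.104 | 0.031 | 0.031 | 0.061 | -0.267 | -0.276 | 0.027 | 0.065 | -0.218 | -0.261 | -0.285 | 0.096 | 1.000 | 0.025 | 0.025 | 0.039 | 0.042 | 0.058 | 0.024 | 0.036 | 0.062 | 0.043 |
| private_pool_join | 0.142 | 0.035 | 0.159 | 0.000 | 0.187 | 0.103 | 0.054 | 0.170 | 0.124 | 0.123 | 0.041 | 0.051 | 0.183 | 0.109 | 0.025 | 1.000 | 0.036 | 0.061 | 0.212 | 0.169 | 0.095 | 0.233 | 0.188 | 0.203 |
| mlsid_join_bool | 0.084 | 0.001 | 0.162 | 0.003 | 0.108 | 0.026 | 0.091 | 0.080 | 0.081 | 0.060 | 0.049 | 0.063 | 0.168 | 0.068 | 0.025 | 0.036 | 1.000 | 0.106 | 0.136 | 0.168 | 0.446 | 0.220 | 0.088 | 0.334 |
| fireplace_booled | 0.276 | 0.009 | 0.160 | 0.003 | 0.317 | 0.036 | 0.087 | 0.322 | 0.260 | 0.157 | 0.071 | 0.088 | 0.168 | 0.128 | 0.039 | 0.061 | 0.106 | 1.000 | 0.347 | 0.277 | 0.102 | 0.297 | 0.120 | 0.260 |
| state_zip | 0.917 | 0.022 | 0.195 | 0.024 | 0.092 | 0.026 | 0.206 | 0.761 | 0.828 | 0.162 | 0.207 | 0.207 | 0.173 | 0.182 | 0.042 | 0.212 | 0.136 | 0.347 | 1.000 | 0.533 | 0.155 | 0.205 | 0.352 | 0.241 |
| major_city_ch | 0.661 | 0.045 | 0.165 | 0.000 | 0.104 | 0.041 | 0.209 | 0.620 | 0.537 | 0.189 | 0.255 | 0.237 | 0.166 | 0.205 | 0.058 | 0.169 | 0.168 | 0.277 | 0.533 | 1.000 | 0.180 | 0.232 | 0.429 | 0.268 |
| status_cl | 0.121 | 0.000 | 0.137 | 0.000 | 0.040 | 0.007 | 0.036 | 0.125 | 0.098 | 0.040 | 0.029 | 0.036 | 0.075 | 0.050 | 0.024 | 0.095 | 0.446 | 0.102 | 0.155 | 0.180 | 1.000 | 0.084 | 0.201 | 0.114 |
| property_type_fin | 0.178 | 0.010 | 0.171 | 0.015 | 0.295 | 0.049 | 0.090 | 0.152 | 0.156 | 0.072 | 0.101 | 0.110 | 0.233 | 0.069 | 0.036 | 0.233 | 0.220 | 0.297 | 0.205 | 0.232 | 0.084 | 1.000 | 0.261 | 0.458 |
| remodeled_year_bool | 0.277 | 0.007 | 0.164 | 0.007 | 0.291 | 0.026 | 0.099 | 0.292 | 0.227 | 0.091 | 0.125 | 0.124 | 0.408 | 0.084 | 0.062 | 0.188 | 0.088 | 0.120 | 0.352 | 0.429 | 0.201 | 0.261 | 1.000 | 0.390 |
| heating_system | 0.180 | 0.010 | 0.189 | 0.007 | 0.280 | 0.011 | 0.083 | 0.179 | 0.161 | 0.073 | 0.085 | 0.100 | 0.257 | 0.062 | 0.043 | 0.203 | 0.334 | 0.260 | 0.241 | 0.268 | 0.114 | 0.458 | 0.390 | 1.000 |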
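
In the correlation table, target_clean and log_target_clean have a coefficient of 1.000 with each other and identical coefficients against every other numeric variable. That pattern is what a rank-based coefficient such as Spearman's would produce, because a monotone transform like the logarithm leaves ranks unchanged. The report does not state which coefficient it used, so treat this as an inference. The sketch below is a plain-Python illustration of the idea, not the report's own implementation; the helper names (`average_ranks`, `pearson`, `spearman`) are hypothetical, and the five value pairs are copied from the first sample rows of this report.

```python
import math

def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# target_clean and sqft_clean from the first five sample rows of the report.
target = [418000.0, 310000.0, 2895000.0, 2395000.0, 5000.0]
sqft = [2900, 1947, 3000, 6457, 0]
log_target = [math.log(t) for t in target]

rho_raw = spearman(target, sqft)
rho_log = spearman(log_target, sqft)
print(round(rho_raw, 3), round(rho_log, 3))  # 0.9 0.9
```

Both calls return the same value because taking the logarithm does not reorder the targets. A Pearson coefficient computed on the raw values would generally differ between the raw and log columns.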

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
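
As a rough illustration of the nullity correlation described above, the sketch below turns each column into a present/missing indicator and takes the Pearson correlation of the two indicators. It is a minimal plain-Python version under stated assumptions: the function names are hypothetical, the two toy columns are made up for the example and are not taken from this dataset, and the plotting library behind the report may compute the value differently.

```python
import math

def is_missing(value):
    """Treat None and float NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the 'is present' indicators of two columns.

    Returns None when either column is entirely present or entirely missing,
    because its indicator has zero variance and the correlation is undefined.
    """
    a = [0.0 if is_missing(v) else 1.0 for v in col_a]
    b = [0.0 if is_missing(v) else 1.0 for v in col_b]
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return None
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    return cov / math.sqrt(var_a * var_b)

# Toy columns (hypothetical values): missing entries are None or NaN.
baths = [3.5, 3.0, None, float("nan"), 2.0, None]
other = [1.0, 4.0, None, 7.0, 2.0, None]

r = nullity_correlation(baths, other)
print(round(r, 3))  # 0.707
```

A value near 1 means the two columns tend to be missing in the same rows, a value near -1 means one tends to be present where the other is missing, and a value near 0 means the missingness patterns are unrelated.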

Sample

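| | zipcode | target_clean | log_target_clean | private_pool_join | mlsid_join_bool | sqft_clean | log_sqft_clean | baths_clean | fireplace_booled | population | lat | lng | state_zip | median_household_income | housing_units | occupied_housing_units | major_city_ch | status_cl | property_type_fin | home_age | remodeled_year_bool | heating_system | mean_rating | min_dist_school |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 28387 | 418000.0 | 12.943237 | 0 | 1 | 2900 | 7.972811 | 3.5 | 1 | 13921.0 | 35.18 | -79.40 | NC | 47982.0 | 7608.0 | 6612.0 | others | active | single-family | 3.0 | 0 | OTHER | 5.200000 | 2.700000 |
| 1 | 99216 | 310000.0 | 12.644328 | 0 | 1 | 1947 | 7.574558 | 3.0 | 0 | 24362.0 | 47.69 | -117.19 | WA | 45098.0 | 10906.0 | 10144.0 | others | active | single-family | 3.0 | 0 | NO DATA | 4.000000 | 1.010000 |
| 2 | 90049 | 2895000.0 | 14.878496 | 1 | 1 | 3000 | 8.006701 | 2.0 | 1 | 35482.0 | 34.08 | -118.49 | CA | 110854.0 | 18097.0 | 16657.0 | Los Angeles | active | single-family | 61.0 | 1 | AIR | 6.666667 | 1.190000 |
| 3 | 75205 | 2395000.0 | 14.688894 | 0 | 1 | 6457 | 8.773075 | 8.0 | 1 | 23061.0 | 32.84 | -96.80 | TX | 108913.0 | 9985.0 | 9016.0 | Dallas | active | single-family | 16.0 | 1 | AIR | 9.000000 | 0.100000 |
| 4 | 32908 | 5000.0 | 8.517193 | 0 | 1 | 0 | 0.000000 | NaN | 0 | 10892.0 | 27.96 | -80.70 | FL | 46466.0 | 4230.0 | 3657.0 | others | active | land | 0.0 | 0 | NO HEATING NEED | 4.666667 | 3.030000 |
| 5 | 19145 | 209000.0 | 12.250090 | 0 | 1 | 897 | 6.800170 | NaN | 0 | 47261.0 | 39.91 | -75.20 | PA | 35761.0 | 20874.0 | 18802.0 | Philadelphia | active | townhouse | 102.0 | 0 | AIR | 4.938800 | 1.232491 |
| 6 | 34759 | 181500.0 | 12.109011 | 0 | 1 | 1507 | 7.318540 | NaN | 0 | 30170.0 | 28.08 | -81.44 | FL | 42163.0 | 13256.0 | 10181.0 | Kissimmee | active | others | 16.0 | 1 | ELECTRIC | 2.333333 | 0.800000 |
| 7 | 38115 | 68000.0 | 11.127263 | 0 | 1 | 0 | 0.000000 | NaN | 0 | 39129.0 | 35.05 | -89.86 | TN | 29230.0 | 18726.0 | 15409.0 | others | active | na | 46.0 | 0 | NO DATA | 2.666667 | 0.400000 |
| 8 | 50401 | 244900.0 | 12.408605 | 0 | 1 | 3588 | 8.185629 | 2.0 | 0 | 29837.0 | 43.15 | -93.19 | IA | 42458.0 | 14147.0 | 13103.0 | others | active | single-family | 52.0 | 0 | AIR | 3.800000 | 5.600000 |
| 9 | 77080 | 311995.0 | 12.650742 | 0 | 1 | 1930 | 7.565793 | 3.0 | 0 | 45275.0 | 29.82 | -95.52 | TX | 38159.0 | 16376.0 | 14741.0 | Houston | not defined | single-family | 3.0 | 0 | GAS | 3.000000 | 0.600000 |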
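
| | zipcode | target_clean | log_target_clean | private_pool_join | mlsid_join_bool | sqft_clean | log_sqft_clean | baths_clean | fireplace_booled | population | lat | lng | state_zip | median_household_income | housing_units | occupied_housing_units | major_city_ch | status_cl | property_type_fin | home_age | remodeled_year_bool | heating_system | mean_rating | min_dist_school |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 374400 | 78212 | 799500.0 | 13.591742 | 1 | 1 | 950 | 6.857514 | NaN | 0 | 28415.0 | 29.45 | -98.50 | TX | 36613.0 | 13137.0 | 11544.0 | San Antonio | active | single-family | 84.0 | 1 | OTHER | 4.000000 | 0.25 |
| 374401 | 77080 | 280000.0 | 12.542545 | 0 | 1 | 1792 | 7.491645 | 3.0 | 1 | 45275.0 | 29.82 | -95.52 | TX | 38159.0 | 16376.0 | 14741.0 | Houston | active | single-family | 52.0 | 1 | OTHER | 2.666667 | 0.19 |
| 374402 | 32805 | 171306.0 | 12.051207 | 0 | 0 | 1829 | 7.512071 | 2.0 | 1 | 21810.0 | 28.53 | -81.40 | FL | 23950.0 | 9765.0 | 8278.0 | Orlando | not defined | single-family | 60.0 | 1 | AIR | 2.333333 | 1.10 |
| 374403 | 76110 | 199900.0 | 12.205573 | 0 | 1 | 1895 | 7.547502 | NaN | 0 | 30434.0 | 32.71 | -97.34 | TX | 39600.0 | 11027.0 | 9764.0 | Fort Worth | active | single-family | 101.0 | 0 | NO DATA | 5.000000 | 0.50 |
| 374404 | 77089 | 252990.0 | 12.441105 | 0 | 0 | 1841 | 7.518607 | 2.0 | 0 | 48685.0 | 29.59 | -95.23 | TX | 63141.0 | 16805.0 | 15902.0 | Houston | not defined | single-family | 3.0 | 0 | NO DATA | 6.000000 | 0.30 |
| 374405 | 20001 | 799000.0 | 13.591116 | 0 | 1 | 1417 | 7.257003 | 3.0 | 0 | 38551.0 | 38.91 | -77.02 | DC | 78848.0 | 18751.0 | 16500.0 | Washington | active | condo | 12.0 | 0 | AIR | 3.000000 | 0.10 |
| 374406 | 33180 | 1249000.0 | 14.037854 | 1 | 1 | 4017 | 8.298540 | 6.0 | 0 | 30840.0 | 25.96 | -80.14 | FL | 68317.0 | 20316.0 | 14197.0 | Miami | not defined | single-family | 32.0 | 1 | OTHER | 5.000000 | 1.10 |
| 374407 | 60657 | 674999.0 | 13.422466 | 0 | 1 | 2000 | 7.601402 | 3.0 | 0 | 65996.0 | 41.94 | -87.65 | IL | 75885.0 | 41483.0 | 37865.0 | Chicago | active | condo | 98.0 | 0 | OTHER | 4.333333 | 0.40 |
| 374408 | 11434 | 528000.0 | 13.176852 | 0 | 0 | 1152 | 7.050123 | 3.0 | 0 | 59129.0 | 40.68 | -73.78 | NY | 59229.0 | 21681.0 | 20244.0 | others | active | single-family | 72.0 | 1 | OTHER | 4.500000 | 0.48 |
| 374409 | 78218 | 204900.0 | 12.230277 | 0 | 1 | 1462 | 7.288244 | 2.0 | 0 | 31917.0 | 29.49 | -98.40 | TX | 38812.0 | 13900.0 | 12367.0 | San Antonio | not defined | single-family | 3.0 | 0 | ELECTRIC | 4.000000 | 0.30 |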